Group sequential designs for pragmatic clinical trials with early outcomes: methods and guidance for planning and implementation

Background Group sequential designs are one of the most widely used methodologies for adaptive design in randomized clinical trials. In settings where early outcomes are available, they offer large gains in efficiency compared to a fixed design. However, such designs are underused and used predominantly in therapeutic areas where there is expertise and experience in implementation. One barrier to their greater use is the requirement to undertake simulation studies at the planning stage, which demand considerable knowledge and coding experience and incur additional costs. Based on some modest assumptions about the likely patterns of recruitment and the covariance structure of the outcomes, some simple analytic expressions are presented that remove the need to undertake simulations.
Methods A model for longitudinal outcomes with an assumed approximate multivariate normal distribution and three contrasting simple recruitment models are described, based on fixed, increasing and decreasing rates. For assumed uniform and exponential correlation models, analytic expressions for the variance of the treatment effect and the effects of the early outcomes on reducing this variance at the primary outcome time-point are presented. Expressions for the minimum and maximum values show how the correlations and timing of the early outcomes affect design efficiency.
Results Simulations showed how patterns of information accrual varied between correlation and recruitment models, and consequently led to some general guidance for planning a trial. Using a previously reported group sequential trial as an exemplar, it is shown how the analytic expressions given here could have been used as a quick and flexible planning tool, avoiding the need for extensive simulation studies based on individual participant data.
Conclusions The analytic expressions described can be routinely used at the planning stage of a putative trial, based on some modest assumptions about the likely number of outcomes and when they might occur and the expected recruitment patterns. Numerical simulations showed that these models behaved sensibly and allowed a range of design options to be explored in a way that would have been difficult and time-consuming if the previously described method of simulating individual trial participant data had been used.
Supplementary Information The online version contains supplementary material available at 10.1186/s12874-024-02174-w.


A.1 Uniform correlation model
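In the most general setting, we assume independence between participants with a multivariate normal distribution for outcomes $(y_{ij1}, \ldots, y_{ijs})$, with mean $(\mu_{j1}, \ldots, \mu_{js})$ and covariance matrix $\Sigma$ with elements $\Sigma_{rr'} = \rho_{rr'}\sigma_r\sigma_{r'}$, where $\sigma_r$ is the standard deviation of the outcome at occasion $r$ and $\rho_{rr'}$ is the correlation between endpoints at occasions $r$ and $r'$, for $r, r' = 1, \ldots, s$. For the uniform correlation model, $R$ is an $s \times s$ correlation matrix given by
$$R = \begin{pmatrix} 1 & \alpha & \cdots & \alpha \\ \alpha & 1 & \cdots & \alpha \\ \vdots & \vdots & \ddots & \vdots \\ \alpha & \alpha & \cdots & 1 \end{pmatrix},$$
after setting $\rho_{rr'} = \alpha$ for all occasions $r = 1, \ldots, s$ and $r' = 1, \ldots, s$ when $r \neq r'$ and $\rho_{rr'} = 1$ when $r = r'$.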
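As a concrete illustration of the uniform correlation structure, the following minimal Python sketch builds $R$ and the implied covariance matrix. It assumes $\Sigma = SRS$ with $S = \mathrm{diag}(\sigma_1, \ldots, \sigma_s)$, which is consistent with the inverse $\Sigma^{-1} = S^{-1}R^{-1}S^{-1}$ used in this section. The function names and numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def uniform_correlation(s, alpha):
    """Return the s x s matrix with 1 on the diagonal and alpha elsewhere."""
    # Positive definiteness requires -1/(s-1) < alpha < 1 (for s > 1).
    if s > 1 and not (-1.0 / (s - 1) < alpha < 1.0):
        raise ValueError("alpha must lie in (-1/(s-1), 1)")
    return (1.0 - alpha) * np.eye(s) + alpha * np.ones((s, s))

def covariance(sigma, R):
    """Return Sigma = S R S, where S = diag(sigma) (assumed structure)."""
    S = np.diag(np.asarray(sigma, dtype=float))
    return S @ R @ S

# Hypothetical standard deviations at s = 3 occasions, with alpha = 0.5.
sigma = [10.0, 12.0, 15.0]
R = uniform_correlation(3, 0.5)
Sigma = covariance(sigma, R)
```

Each off-diagonal element of `Sigma` is then $\alpha\sigma_r\sigma_{r'}$ and each diagonal element is $\sigma_r^2$.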
The variance of the model parameters var(β) is given by
$$\mathrm{var}(\beta) = \left(\sum_i X_i^{T}\,\Sigma_i(\sigma, \alpha)^{-1}X_i\right)^{-1},$$
where $\Sigma_i(\sigma, \alpha)$ is the covariance matrix of $y_i$ for participant $i$, characterised by parameters $\sigma = (\sigma_1, \ldots, \sigma_r)$ and $\alpha$, $X_i$ is an $r \times 2s$ design matrix and $\beta$ is a $2s \times 1$ vector of unknown model parameters. $\beta$ can be structured, for convenience, such that $\beta = (\beta_{10}, \beta_{20}, \ldots, \beta_{s0}, \beta_1, \beta_2, \ldots, \beta_s)$, where $\beta_{r0}$ estimates the outcome mean in the control arm of the study, and $\beta_r$ estimates the effect of the treatment arm relative to the control arm at time-point $d_r$. Therefore, $\beta_s$ is the effect of the treatment on the study outcome at time-point $d_s$ (the primary endpoint). Noting that $\Sigma^{-1} = S^{-1}R^{-1}S^{-1}$ and
$$R^{-1} = \frac{1}{(\alpha - 1)\{1 + (s - 1)\alpha\}}\begin{pmatrix} a & b & \cdots & b \\ b & a & \cdots & b \\ \vdots & \vdots & \ddots & \vdots \\ b & b & \cdots & a \end{pmatrix},$$
where $a = -(1 + (s - 2)\alpha)$ and $b = \alpha$. Assuming that the numbers of participants with outcome data are structured such that N 0 …, where $N_{0r}$ is the number of participants in the control arm and $N_{1r}$ is the number in the treatment arm at occasion $r$, and after some algebraic manipulation, we can write
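The two ingredients of this derivation can be checked numerically. The minimal Python sketch below (i) compares a closed-form inverse of the uniform correlation matrix, written in terms of $a$ and $b$, with a numerical inverse, and (ii) evaluates the variance of the treatment effect at the final occasion. It rests on three assumptions that are not quoted from the text: the standard generalised least squares form $(\sum_i X_i^{T}\Sigma_i^{-1}X_i)^{-1}$ for the variance, a scaling factor $1/\{(\alpha - 1)(1 + (s - 1)\alpha)\}$ for the inverse (derived here), and a monotone data pattern in which a participant with $r$ outcomes has data at occasions $1, \ldots, r$. All function names and sample sizes are hypothetical.

```python
import numpy as np

def uniform_inverse(s, alpha):
    """Closed-form inverse of the s x s uniform correlation matrix."""
    a = -(1.0 + (s - 2) * alpha)   # diagonal element, as defined in the text
    b = alpha                      # off-diagonal element, as defined in the text
    scale = 1.0 / ((alpha - 1.0) * (1.0 + (s - 1) * alpha))  # assumed scaling
    return scale * (np.full((s, s), b) + (a - b) * np.eye(s))

def final_effect_variance(sigma, alpha, n_control, n_treat):
    """GLS variance of the treatment effect at the final occasion.

    n_control[r - 1] and n_treat[r - 1] are the numbers of participants whose
    observed outcomes are exactly occasions 1, ..., r (assumed monotone pattern).
    """
    s = len(sigma)
    info = np.zeros((2 * s, 2 * s))
    for arm, counts in enumerate((n_control, n_treat)):
        for r in range(1, s + 1):
            n = counts[r - 1]
            if n == 0:
                continue
            # r x 2s design matrix: control means, plus effects in the treatment arm.
            X = np.zeros((r, 2 * s))
            X[:, :r] = np.eye(r)
            if arm == 1:
                X[:, s:s + r] = np.eye(r)
            S_inv = np.diag(1.0 / np.asarray(sigma[:r], dtype=float))
            Sigma_inv = S_inv @ uniform_inverse(r, alpha) @ S_inv
            info += n * (X.T @ Sigma_inv @ X)
    # The last parameter is the treatment effect at the primary endpoint.
    return np.linalg.inv(info)[-1, -1]

# Check the closed-form inverse against a numerical inverse (s = 4, alpha = 0.3).
R = (1 - 0.3) * np.eye(4) + 0.3 * np.ones((4, 4))
assert np.allclose(uniform_inverse(4, 0.3), np.linalg.inv(R))

# Hypothetical interim: per arm, 50 participants with both outcomes and
# 50 with the early outcome only (s = 2, unit variances, alpha = 0.5).
v_with_early = final_effect_variance([1.0, 1.0], 0.5, [50, 50], [50, 50])
v_final_only = final_effect_variance([1.0, 1.0], 0.5, [0, 50], [0, 50])
```

In this hypothetical example `v_with_early` is smaller than `v_final_only`, which reflects the reduction in variance obtained from correlated early outcomes.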

A.2 Exponential correlation model
Using the same arguments as in Section A.1, for the exponential model $R_s$ is an $s \times s$ correlation matrix given by , and .
Therefore, assuming that the outcome data are structured such that N 0 …, and after some algebraic manipulation, we can write
For the fixed recruitment rate model (Section 3.1), where $g_r(t, d_r) = (t - d_r)$, the partial derivatives of $V_s^{\exp}$, see expression (15), with respect to $d_{s-m}$ are , where $m = 1, \ldots, s - 2$.
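For intuition about the exponential correlation model, the sketch below assumes the parametrisation $\rho_{rr'} = \alpha^{|d_r - d_{r'}|}$. This specific form, the function name and the time-points are assumptions made for illustration and may differ from the definition used in the main text. Under that assumption the inverse correlation matrix is tridiagonal, so only adjacent occasions are directly linked.

```python
import numpy as np

def exponential_correlation(d, alpha):
    """Correlation matrix with entries alpha ** |d_r - d_r'| (assumed form)."""
    d = np.asarray(d, dtype=float)
    return alpha ** np.abs(d[:, None] - d[None, :])

# Hypothetical outcome time-points (in years) and correlation parameter.
d = [0.25, 0.5, 1.0, 2.0]
R = exponential_correlation(d, 0.6)
R_inv = np.linalg.inv(R)

# Collect the entries of the inverse more than one step off the diagonal;
# under the assumed form these vanish up to rounding error.
idx = np.arange(len(d))
off_band = R_inv[np.abs(np.subtract.outer(idx, idx)) > 1]
```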

A.4 Recruitment and follow-up models
We set the recruitment period $T_R$ to be a multiple $m$ of the primary (final) outcome time $d_s$, $T_R = md_s$, where $m > 1$. Typically in pragmatic clinical trials the length of recruitment might be two ($m = 2$), three ($m = 3$) or four ($m = 4$) times the final outcome time-point; e.g. for a trial with a 12-month outcome, recruitment might take 24, 36 or 48 months to complete.
If $t_f$ is the time period, after the study primary outcome, when an interim analysis at time $t$ occurs (i.e. $t_f = t - d_s$), then from Section 3.1, for the fixed rate recruitment model
$$\frac{N_s(t, d_s)}{N} = \frac{t_f}{md_s}.$$
At an interim look at information fraction $\tau_0$ (i.e. $\tau_0 = N_s(t, d_s)/N$) we might typically require some proportion $\tau_0$, where $0 < \tau_0 \le 1$, of the study participants to have primary outcome data available for analysis; for the fixed rate recruitment model this will be at time $t_f = \tau_0 md_s$ after the study primary outcome time at $d_s$. The full study length will be given by $d_s + t_f = d_s + \tau_0 md_s$ for $\tau_0 = 1$, which is $d_s + T_R$, when follow-up on the final participant recruited at $T_R$ is complete.
If $t_i$ is the time period, after the study primary outcome, when an interim analysis at time $t$ occurs (i.e. $t_i = t - d_s$), then from Section 3.2, for the increasing rate recruitment model
$$\frac{N_s(t, d_s)}{N} = \frac{t_i(t_i + 1)}{md_s(md_s + 1)}.$$
As we require the interim analysis for the increasing rate model to be at the same value of $\tau_0$ as used for the fixed rate model, we can set $\tau_0 = t_f/(md_s)$ for $N_s(t, d_s)/N$ in the above to get the expression $t_i^2 + t_i - t_f(md_s + 1) = 0$, which has solution
$$t_i = \frac{-1 + \sqrt{1 + 4t_f(md_s + 1)}}{2}.$$
As a check, we note that when $\tau_0 = 1$ (and $t_f = md_s$) the study has completed follow-up and then from the above expression $t_i = t_f$, as we would expect.
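The relationship between the fixed rate and increasing rate interim times can be sketched in a few lines of Python. The code takes $t_f = \tau_0 md_s$ from the fixed rate model and then solves the quadratic $t_i^2 + t_i - t_f(md_s + 1) = 0$ for its positive root. The function names and the example values ($d_s = 12$ months, $m = 2$) are hypothetical.

```python
import math

def fixed_rate_time(tau0, m, d_s):
    """Time after d_s at which a fraction tau0 has primary outcome data."""
    if not 0 < tau0 <= 1:
        raise ValueError("tau0 must satisfy 0 < tau0 <= 1")
    return tau0 * m * d_s

def increasing_rate_time(t_f, m, d_s):
    """Positive root of t_i**2 + t_i - t_f * (m * d_s + 1) = 0."""
    return (-1.0 + math.sqrt(1.0 + 4.0 * t_f * (m * d_s + 1))) / 2.0

# Hypothetical trial: 12-month primary outcome, 24-month recruitment (m = 2).
d_s, m = 12, 2
t_f_half = fixed_rate_time(0.5, m, d_s)            # interim look at tau0 = 0.5
t_i_half = increasing_rate_time(t_f_half, m, d_s)  # matching time, increasing rate
t_i_full = increasing_rate_time(fixed_rate_time(1.0, m, d_s), m, d_s)
```

With these values the interim look under increasing rate recruitment falls later than under fixed rate recruitment ($t_i > t_f$), and the two coincide at $\tau_0 = 1$.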
If $t_d$ is the time period, after the study primary outcome, when an interim analysis at time $t$ occurs (i.e. $t_d = t - d_s$), then from Section 3.3, for the decreasing rate recruitment model
$$\frac{N_s(t, d_s)}{N} = \frac{t_d(2md_s + 1 - t_d)}{md_s(md_s + 1)}.$$
As we require the interim analysis for the decreasing rate model to be at the same value of $\tau_0$ as used for the fixed rate model, we can set $\tau_0 = t_f/(md_s)$ for $N_s(t, d_s)/N$ in the above to get the expression $-t_d^2 + t_d(2md_s + 1) - t_f(md_s + 1) = 0$, which has solution
$$t_d = \frac{(2md_s + 1) - \sqrt{(2md_s + 1)^2 - 4t_f(md_s + 1)}}{2}.$$
As a check, we note that when $\tau_0 = 1$ (and $t_f = md_s$) the study has completed follow-up and then from the above expression $t_d = t_i = t_f$, as we would expect.
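The corresponding calculation for the decreasing rate model can be sketched in the same way. The Python function below solves $-t_d^2 + t_d(2md_s + 1) - t_f(md_s + 1) = 0$ and takes the smaller root, on the reasoning that the larger root always exceeds the recruitment period $md_s$. The function name and the example values ($d_s = 12$ months, $m = 3$) are hypothetical.

```python
import math

def decreasing_rate_time(t_f, m, d_s):
    """Smaller root of -t_d**2 + t_d*(2*m*d_s + 1) - t_f*(m*d_s + 1) = 0."""
    T_R = m * d_s                  # recruitment period
    c = 2 * T_R + 1
    disc = c * c - 4.0 * t_f * (T_R + 1)
    # The larger root is at least T_R + 0.5, so it lies outside the
    # recruitment period; the smaller root is the relevant solution.
    return (c - math.sqrt(disc)) / 2.0

# Hypothetical trial: 12-month primary outcome, 36-month recruitment (m = 3).
d_s, m = 12, 3
t_f_half = 0.5 * m * d_s                           # fixed rate time at tau0 = 0.5
t_d_half = decreasing_rate_time(t_f_half, m, d_s)  # matching time, decreasing rate
t_d_full = decreasing_rate_time(m * d_s, m, d_s)   # tau0 = 1
```

Here the interim look under decreasing rate recruitment falls earlier than under fixed rate recruitment ($t_d < t_f$), because more participants are recruited early, and the two coincide at $\tau_0 = 1$.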